1. Introduction

Many AI systems that perform a "heuristic search" (i.e. they can be thought of as searching some space of possibilities for an answer) are based upon one or both of two programming techniques known as constraint propagation and hypothesize-and-test.

In a system based on constraint propagation, internal data structures represent (implicitly or explicitly) potentially acceptable points in the search space. Computation proceeds in narrowing down these possibilities by employing knowledge of the domain in the structure of the computation. There is not enough space here to properly introduce the concepts involved in constraint propagation. The reader is referred to some systems described in the literature [1, 10] for an introduction. One point we wish to emphasize about pure constraint propagation is that at any time the internal data structures will be consistent with any solution to the problem. Thus, if more than one solution is possible, pure propagation of constraints will be unable to select only one of them. Further, even if a unique solution exists, a constraint propagation system may not be able to find it.

The hypothesize-and-test methodology allows the program to make assumptions that narrow the size of the search space; there is no guarantee that the assumption is consistent with any solution to the problem. The program continues to make hypotheses until a solution is located or it has been determined that no solution is possible with the current set of assumptions. There is no requirement that any hypothesis be correct and so mechanisms must be available that prevent commitment to any hypothesis until it has been demonstrated to be acceptable. The most commonly available mechanism is known as backtracking. Backtracking allows the program to return to an environment that would exist had that assumption not been made.

As long as the search space is enumerable (a very weak assumption) hypothesize-and-test can easily be seen to be logically more powerful. If there are several consistent solutions, a pure constraint propagation system has no way to establish preference for one of them. Even if only one solution is possible a constraint propagation system will not necessarily find it; this will be demonstrated later by example. The proponents of constraint propagation point out that hypothesize-and-test is grossly inefficient in situations where constraint propagation can function (see for example Waltz [13]). The example in this paper bears out this claim, though one recent study [1] suggests there are situations in which pure backtracking is more efficient than constraint propagation.

One can, however, imagine a composite system that has aspects of both constraint propagation and hypothesize-and-test. In such a system, constraint propagation can be used to prune the search space, while hypothesize-and-test continues the search where constraint propagation is not able to. A constraint language that can support the creation of such systems has been constructed by Steele [11]. Steele allows assumptions to be made and backtracking performed. The current work discusses another such system in which the hypothesize-and-test methodology allows more than one assumption to be pursued concurrently. It is an extension of earlier work discussing parallel problem solving systems [6, 7] and a language, Ether, for implementing these systems. Here we examine one particular kind of search problem, cryptarithmetic addition, of the sort used by Newell and Simon [9]. We study this problem, not because it is interesting in itself, but because it is well-defined and test cases are relatively easy to come by. This allows us to test the efficiencies of algorithms empirically. We have constructed a parallel problem solver for doing these cryptarithmetic problems.

There are two main points we wish to make:

1. That a system combining both constraint propagation and parallel hypothesize-and-test methodologies can be constructed. The code is simple to read, write, and understand. Example code is presented.

2. That a parallel program for solving these puzzles can be constructed that requires less average run time, when executed by time-slicing on a single processor, than a sequential program executed on the same processor. Obviously, it matters which sequential and which parallel program we compare; the benchmarks for this comparison will be explained later and are, I think, quite reasonable. The speedup we are talking about here is not large, but is noticeable. The important point is that it is present at all. A similar effect has been noticed in other studies for various problems [5, 7]. It suggests that concurrency may be useful for the design of heuristic search algorithms whether the programs are executed on concurrent hardware or on a conventional sequential computer.

The remainder of this paper consists of a discussion of the problem being solved and the nature of the parallel solution. We show how the efficiency of the parallel program depends on the use of heuristic information for allocating resources of the parallel program. We then develop a series of allocation strategies, each one improving on the previous one. We finally discuss the importance of this experiment for a general theory of problem solving. We show how the allocation strategies represent a use of what has been called meta-level knowledge in the literature, i.e. knowledge about how to guide the search process to gain efficiency. In this study, concurrency is necessary to make use of this meta-level knowledge.

2.  The  problem 

We are given three strings of letters, e.g. "DONALD", "GERALD", and "ROBERT", that represent integers when substitutions of digits are made for each of the letters. There is at least one possible assignment of digits for letters so that the numbers represented by the first two ("DONALD" and "GERALD"), when added, yield the number represented by the third ("ROBERT"). Any one of these assignments is a solution. Each of the problems we will be looking at contains exactly ten letters. A solution consists of a mapping from these ten letters onto the ten digits 0 through 9.
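To make the definition concrete, the following Python sketch (ours, not part of the Ether implementation) checks whether a proposed mapping is a solution. The names word_value and is_solution are hypothetical; the assignment used in the usage note was checked by direct arithmetic (526485 + 197485 = 723970).

```python
def word_value(word, assignment):
    """Read `word` as a decimal integer under a letter -> digit mapping."""
    value = 0
    for letter in word:
        value = value * 10 + assignment[letter]
    return value


def is_solution(addend1, addend2, total, assignment):
    """True if `assignment` maps the letters one-to-one onto digits and the sum holds."""
    letters = set(addend1 + addend2 + total)
    if set(assignment) != letters:
        return False  # every letter, and only those letters, must be assigned
    if len(set(assignment.values())) != len(letters):
        return False  # two letters share a digit
    return (word_value(addend1, assignment) + word_value(addend2, assignment)
            == word_value(total, assignment))


DONALD_SOLUTION = {"D": 5, "O": 2, "N": 6, "A": 4, "L": 8,
                   "G": 1, "E": 9, "R": 7, "B": 3, "T": 0}
```

Calling is_solution("DONALD", "GERALD", "ROBERT", DONALD_SOLUTION) accepts this mapping, while exchanging the digits of any two letters makes the check fail.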


3.  A  Constraint  Propagation  Solution 

In our construction of the constraint network we will use the actor model of computation. We find it a very natural formalism for building these sorts of systems. In this formalism nodes of the network are implemented as actors. Constraint propagation between nodes is implemented by sending messages containing the new constraints to the node being constrained. For our cryptarithmetic problem solver we have three kinds of nodes: letters, digits, and columns. They are arranged as shown:


[Figure: the constraint network of column, letter, and digit nodes]

Arcs in the diagram indicate flow of constraints. Thus column nodes can constrain their left and right neighbor columns and certain letter nodes (the ones representing letters contained within the column). Letter nodes can constrain digit nodes and column nodes that contain their respective letters. Digit nodes can constrain letter nodes. In the initial configuration, before constraint propagation begins, we store at each letter node a list of possible digits that contains all ten digits. Similarly, each digit node contains a "possible letters list" containing all ten letters. We will give a short description of what each node has to do when it receives a message informing it of a new constraint.

Columns. A column can receive messages informing it of new constraints on letters it contains and on possible values for its carry-in and carry-out. If a column node receives any such messages, it computes possible new constraints on its letters, carry-in, and carry-out. If any one of these has no possible values a CONTRADICTION is asserted. When a CONTRADICTION is asserted the code implementing hypothesize-and-test is invoked to take an appropriate action. New constraints on letters are sent to the respective letter nodes. New constraints on carry-in and carry-out are sent to the right and left neighbor columns respectively.

Letters. Letters receive messages that indicate subsets of the digits 0 through 9 that they can possibly be. If they learn of digits that they cannot be, nodes representing those digits are sent messages. Also, each column that contains the letter receives a new message informing it of the new restrictions on the value of the particular letter. If the set of possible digits becomes null, a CONTRADICTION is asserted.

Digits. These receive messages from letter nodes indicating that they are or are not the respective letter. If the set of possible letters is reduced to a singleton, a message is sent to the particular letter. If the set of possible letters is reduced to null, a CONTRADICTION is asserted.
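The work done by a column node can be illustrated with a small Python sketch. It is only an approximation of the behavior described above: the function name narrow_column and the Contradiction exception are hypothetical, the three letter positions are treated as independent (the case of one letter occurring twice in a column is ignored), and sets of possible values stand in for the messages exchanged between actors.

```python
from itertools import product


class Contradiction(Exception):
    """Raised when a letter or carry of the column has no possible value."""


def narrow_column(top, bottom, result, carry_in, carry_out):
    """Keep only values that take part in some consistent column sum.

    Each argument is a set of still-possible values; the narrowed sets are
    returned in the same order.
    """
    keep = [set() for _ in range(5)]
    for a, b, cin in product(top, bottom, carry_in):
        total = a + b + cin
        s, cout = total % 10, total // 10
        if s in result and cout in carry_out:
            for bucket, value in zip(keep, (a, b, s, cin, cout)):
                bucket.add(value)
    if any(not bucket for bucket in keep):
        raise Contradiction("column has no consistent values")
    return tuple(keep)
```

For the rightmost column of DONALD + GERALD = ROBERT, knowing that D is 5 and that the carry-in is 0 narrows T to {0} and the carry-out to {1}; the narrowed carry-out is what would be sent on to the left neighbor column.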

We can observe some things about the ability of this system to satisfactorily derive a unique solution. First, if there is more than one possible solution it will not find any of them. Since the letter and digit assignments of each possible solution are certainly possible assignments, they will appear on the possibility lists attached to each node. A fact that is not so easy to check by inspection, but which is easily demonstrable empirically, is that even if there is only one possible solution (or no possible solutions) the system may not find it (or discover that no solutions exist). Nevertheless, the knowledge can be said to be "present" in the network; if the nodes of the network are instantiated with an assignment of letters to digits, the network will assert a CONTRADICTION iff the assignment is not a solution. Our constraint network, then, needs the ability to make assumptions and test them if it is to be able to solve these puzzles.

4. Hypothesize and Test in Ether

The constraint network and hypothesize-and-test methodologies were written in the Ether language [6, 7]. We will only give enough details about the implementation to support the ensuing discussion. The interested reader is referred to [8] for a more detailed discussion of the implementation.

The primitive operations of the Ether language are based around the notion of an assertion rather than message passing. Rather than coding in a message passing formalism "Send the node for the letter D a message that it is 5" we instead say "Assert that D is 5" and a process of compilation turns this assertional code into a message passing implementation. For certain problems this process of compilation is important because certain ideas can be expressed quite naturally in the assertional form that compile into very complex message passing code. These issues will be discussed in [8].

Because we are interested in the possibility of pursuing more than one instantiation of the constraint network in parallel, we need the ability to have more than one available for processing. For this we introduce the notion of a viewpoint. Each viewpoint tags a mutually compatible collection of assumptions about the possible values of letters and digits together with the constraints that derive via propagation from these assumptions (i.e. a viewpoint is one particular instantiation of the network). Viewpoints are related to each other by an inheritance mechanism. The viewpoint in which A is assumed to be 5 and B is assumed to be 4 might be a subviewpoint of the one in which A is assumed 5 and no other assumptions have been made. Viewpoints are the repositories of assumptions and facts derived from these assumptions.
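The inheritance relation between viewpoints can be pictured with the following Python sketch. It is an illustration under our own assumptions, not a description of how Ether stores viewpoints: the class Viewpoint and its methods are hypothetical names, and only constraints on letters are modeled.

```python
ALL_DIGITS = frozenset(range(10))


class Viewpoint:
    """A set of letter constraints that also inherits those of its ancestors."""

    def __init__(self, parent=None):
        self.parent = parent
        self.local = {}  # letter -> frozenset of digits allowed by facts stored here

    def assert_one_of(self, letter, digits):
        allowed = self.local.get(letter, ALL_DIGITS)
        self.local[letter] = allowed & frozenset(digits)

    def assert_cant_be(self, letter, digit):
        allowed = self.local.get(letter, ALL_DIGITS)
        self.local[letter] = allowed - {digit}

    def possible(self, letter):
        """Digits allowed here: the intersection along the chain of parents."""
        digits = ALL_DIGITS
        viewpoint = self
        while viewpoint is not None:
            digits = digits & viewpoint.local.get(letter, ALL_DIGITS)
            viewpoint = viewpoint.parent
        return set(digits)
```

With root = Viewpoint(), a5 = Viewpoint(root) and a5b4 = Viewpoint(a5), asserting A to be 5 in a5 and B to be 4 in a5b4 leaves root unconstrained, while a fact later recorded in root (for example that B cannot be 4) is visible in both subviewpoints.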

In order to be able to hypothesize and test we need to introduce some control primitives. These primitives are built around a construct known as an activity. All processing that happens during execution happens under the auspices of some activity. There are language constructs for conveniently grouping parts of a related task into a single activity. For example, we can create an activity, make a new assumption in a viewpoint, and cause all further work within the viewpoint (i.e. all further constraint passing in the instance of the network defined by the assumption) to be part of the activity.

Activities are of interest because they give us ways to control quantities of system resources available for the execution of alternative explorations. If we stifle an activity, all execution within the activity stops; a stifled activity cannot be restarted. We also have the ability to control the rates at which non-stifled activities run. Different activities can be assigned different amounts of processing power; the total amount of CPU time an activity will get during an interval of time is proportional to its processing power. The processing power of an activity can be altered by the system asynchronously with the running of the activity.
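One way to picture processing power on a single processor is a scheduler that hands out time slices in proportion to each activity's power. The Python sketch below is only an illustration: the Ether scheduler is not described here, the names Activity and run_slices are hypothetical, and the selection rule (a smooth weighted round-robin) is our choice, not necessarily the one Ether uses.

```python
class Activity:
    def __init__(self, name, power=1.0):
        self.name = name
        self.power = power    # relative share of CPU time
        self.credit = 0.0     # bookkeeping for the weighted round-robin
        self.stifled = False
        self.slices = 0       # time slices received so far

    def stifle(self):
        """Stop the activity for good; it is never scheduled again."""
        self.stifled = True


def run_slices(activities, n_slices):
    """Hand out n_slices time slices in proportion to processing power."""
    for _ in range(n_slices):
        runnable = [a for a in activities if not a.stifled and a.power > 0]
        if not runnable:
            break
        total = sum(a.power for a in runnable)
        for a in runnable:
            a.credit += a.power
        chosen = max(runnable, key=lambda a: a.credit)
        chosen.credit -= total
        chosen.slices += 1    # a real system would run the activity's work here
```

With two activities of power 2 and 1, 300 slices are divided 200 to 100; after the second activity is stifled, all further slices go to the first.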

Systems using hypothesize-and-test can be constructed in Ether by using viewpoints to represent assumptions made, and activities to control which parts of the search space are explored, and with what vigor.

5.  A  Simple  Parallel  Solution 

In this treatment we will ignore many details of how both the Ether system and the cryptarithmetic system implemented within it are constructed. If we wish to "create a new instance of the constraint network" that inherits from another, we create a new viewpoint (using the new-viewpoint construct). To add an assertion about a letter being associated with a digit within the context of this viewpoint, we execute (assert (one-of ->letter (->digit))) where letter and digit are bound to the respective letter and digit which we want to assume are identified in this viewpoint. The second argument to one-of is a list of possible digits that the letter can be. So, for example, we could execute (assert (one-of S (1 3 5 7 9))) to indicate that S is odd. Ether syntax makes use of a quasi-quote convention in which symbols prefaced by the character -> are substituted with the values of the associated symbols. If letter were bound to "D" and digit were bound to "5", the item actually asserted would be (one-of D (5)). If the assert is executed within the context of a certain activity, then all work propagating constraints that follow from that assertion will happen within that activity.

The implementation described in this section is quite simple. It first creates a viewpoint in which no assumptions are made and continues propagating constraints within this viewpoint until it has quiesced, i.e. no more propagation can happen. When this state has been achieved, if each letter does not have a unique digit that it can be identified with, it is determined which letter has the least number of possible digits that it can be (excluding those letters that already have a unique assignment). For each one of these digits, a new viewpoint and a new activity are created. Within these (in parallel), the letter is asserted (assumed) to be the digit and propagation of constraints continues. If quiescence is reached in this new activity and the problem has not been solved, we recurse.


The function shown below takes a letter, a list of alternative digits, and a viewpoint. It uses the environment contained in the viewpoint to create new subviewpoints in which the letter is assumed to be each of the alternative digits. We first check to see if there is at least one possible digit. If not, there cannot be a possible solution to this problem consistent with the parent viewpoint and so we assert that there is a contradiction within the parent viewpoint. Otherwise we iterate over each digit in the alternatives list and for each one we create a new viewpoint whose parent is the parent viewpoint and a new activity with parent start-act and assert the letter is the particular digit; this initiates propagation of constraints. If we discover there is a contradiction within the viewpoint (this is accomplished by the code fragment beginning with "(when {(contradiction)}") we assert within the parent viewpoint that the letter cannot be the particular digit. We are justified in doing this because the only difference, in terms of assumptions made, between the current viewpoint and the parent viewpoint is the one assumption of the letter being identified with a particular digit that was a possible alternative in the parent viewpoint; if this assumption leads to a contradiction, we know that this is not a possible identification for the letter. In addition we stifle (stop from executing) the activity that was pursuing the now known to be invalid assumption. We further check to see if the activity quiesces in the section of code beginning with "(when {(quiescent ->a)}". If this has occurred, we first check to see if the problem has been solved. If so we are done; otherwise we determine the letter in the viewpoint with the least number of possible digits (but greater than 1) and recursively call parallel-solve.

(defun parallel-solve (letter alternatives parent-viewpoint)
  (if (null alternatives)
      ; If there are no viable alternatives, then there is no consistent assignment possible
      (within-viewpoint parent-viewpoint (assert (contradiction)))
      ; Otherwise fork on each alternative
      (foreach digit alternatives
        (let ((v (new-viewpoint parent parent-viewpoint))
              (a (new-activity parent start-act)))
          (within-viewpoint v
            (assert (one-of ->letter (->digit)))
            (activate
              (when {(contradiction)}
                (within-viewpoint parent-viewpoint
                  (assert (cant-be ->letter ->digit)))
                (stifle a))))
          (activate
            ; If the activity has quiesced we must first check if the problem has been solved; if so, we are done.
            ; Otherwise we must pick a new branch to go down in a depth-first fashion.
            (when {(quiescent ->a)}
              (if (total-solution (quiescent-letter-constraints v))
                  (halt-ether) ; Indicates we are done
                  (let ((minpair (minimum #'(lambda (pair)
                                              (let ((length (length (cadr pair))))
                                                (if (= length 1)
                                                    11.
                                                    length)))
                                          (quiescent-letter-constraints v))))
                    (parallel-solve (car minpair) (cadr minpair) v)))))))))


The activities created by a recursive call run concurrently with already existing activities. The default allocation of processing power, when no explicit allocation has been done, is such that each running activity gets approximately equal servicing (in terms of CPU seconds) by the scheduler.
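For readers without access to Ether, the combination of propagation and hypothesize-and-test can be sketched in Python. This is not a translation of parallel-solve: the alternatives are explored one after another instead of in concurrent activities, propagation is a simplified column-by-column pass of our own design, the three words are assumed to have equal length, and leading zeros are permitted. The names Contradiction, propagate, and solve are hypothetical.

```python
class Contradiction(Exception):
    """Raised when some letter or carry has no possible value left."""


def propagate(domains, columns):
    """Narrow the digit sets in `domains` until nothing changes (quiescence)."""
    changed = True
    while changed:
        changed = False
        # Distinct letters must take distinct digits.
        for letter, digits in domains.items():
            if len(digits) == 1:
                (digit,) = digits
                for other, other_digits in domains.items():
                    if other != letter and digit in other_digits:
                        other_digits.discard(digit)
                        changed = True
        carries = {0}  # no carry into the rightmost column
        for a, b, s in columns:
            seen = {a: set(), b: set(), s: set()}
            carries_out = set()
            for va in domains[a]:
                for vb in domains[b]:
                    for carry in carries:
                        total = va + vb + carry
                        trial = {}
                        ok = total % 10 in domains[s]
                        for name, value in ((a, va), (b, vb), (s, total % 10)):
                            if trial.setdefault(name, value) != value:
                                ok = False  # one letter, two different digits
                        if len(set(trial.values())) != len(trial):
                            ok = False      # two letters, one digit
                        if ok:
                            for name, value in trial.items():
                                seen[name].add(value)
                            carries_out.add(total // 10)
            for name, values in seen.items():
                if not values:
                    raise Contradiction(name)
                if values != domains[name]:
                    domains[name] = values
                    changed = True
            carries = carries_out
        if 0 not in carries:
            raise Contradiction("carry out of the leftmost column")


def solve(word1, word2, total):
    """Return a letter -> digit mapping, or None if no solution exists."""
    if not len(word1) == len(word2) == len(total):
        raise ValueError("this sketch assumes words of equal length")
    columns = list(zip(reversed(word1), reversed(word2), reversed(total)))
    letters = sorted(set(word1 + word2 + total))

    def search(domains):
        try:
            propagate(domains, columns)
        except Contradiction:
            return None
        undecided = [l for l in letters if len(domains[l]) > 1]
        if not undecided:
            return {l: next(iter(domains[l])) for l in letters}
        letter = min(undecided, key=lambda l: len(domains[l]))
        for digit in sorted(domains[letter]):
            child = {l: set(ds) for l, ds in domains.items()}  # a "subviewpoint"
            child[letter] = {digit}
            result = search(child)
            if result is not None:
                return result
        return None

    return search({l: set(range(10)) for l in letters})
```

Calling solve("DONALD", "GERALD", "ROBERT") returns a mapping whose digits satisfy the sum. Copying the domains for each hypothesis plays the role of creating a subviewpoint, and returning None on a Contradiction plays the role of stifling the corresponding activity.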

6. Alternative Parallel Programs

The simple parallel program described might well be reasonable if we had a large number of processors. With a small number of processors (in particular, only one processor, the case actually studied) it is considerably less efficient in terms of average total run time than some other solutions. All the solutions we will examine are elaborations of, or simple modifications to, the basic parallel program already presented.

We observe that a traditional depth-first search (with backtracking) is but a trivial modification of the code above. When new alternative digits are proposed for a letter, instead of starting them up in concurrent viewpoints as was done above, they are placed on a list. Only the activity for the first one on the list is given any processing power. If it quiesces we recursively call parallel-solve. If it is discovered that the viewpoint is contradictory, the next one is begun (if a next one exists); otherwise, the parent viewpoint is asserted to be inconsistent. Asserting that it is inconsistent will trigger the activity monitoring the next higher viewpoint to pick the next possibility on its list. Depth-first is a degenerate case of parallel search in which only one activity at a time is given non-zero processing power.

6.1 Using Heuristic Information to Control Resource Allocation

A simple elaboration we can make to the parallel implementation presented that preserves its parallel character is to vary the processing power of each activity based on an assessment of how likely the assumptions we have made within its associated viewpoint are to lead to useful information (either leading to a solution or determining that the viewpoint is contradictory). We base the quantity of processing power allocated to the activity doing the exploration on the numerical value of this judgement. For this particular problem, we are more likely to learn in a short period of time whether a viewpoint contains a valid solution or is contradictory if it is already fairly well constrained, i.e. if the letters in the viewpoint only have a few possible digits that they could be. After some experimentation we came upon the following formula for determining relative processing power allocations for the various different activities participating in the search:

$$\left((10 - n_1)^2 + \cdots + (10 - n_{10})^2\right)^2$$

where each n_i is the number of possible digit assignments for the letter i in the viewpoint. If the letters tend to have fewer possible digits, the terms (10 - n_i) will tend to be large. Squaring each term, and squaring the final sum, serves to accentuate the relative differences between the different viewpoints. When the system is first set up, a separate activity known as the manager activity continually monitors each of the other running activities and evaluates this function for each associated viewpoint. The processing power allocations to these activities are adjusted in proportion to the numerical value of this formula. The Ether command we use for modifying the processing power allocations of an activity is called support-in-ratios. It takes three arguments: an activity, a list of activities (that are children of the first) and a list of non-negative numbers with the same number of elements as the list of activities. The processing power assigned to the parent activity is (re)divided among the children activities in proportion to the numbers in this list. Thus, if a factor for a given activity is 0 the activity gets no processing power; if the factor associated with the activity is twice the factor associated with another, then the former activity gets twice as much processing power as the latter. The allocator described is implemented as follows:

(defunc square-both-allocator ()
  (support-in-ratios
    parent start-act
    children currently-exploring-activities
    factors (forlist vpt currently-explored-viewpoints
              (let ((status (quiescent-letter-constraints vpt))
                    (sum 1))
                (foreach pair status
                  (increment sum (expt (- 10. (length (cadr pair))) 2)))
                (max (expt sum 2) 1)))))


We create a separate activity at top-level called the manager-activity and execute the following to have the allocation strategy continually called asynchronously with the activities doing the actual search:

(within-activity manager-activity
  (continuously-execute
    (funcall #'square-both-allocator)))

The manager-activity is given a processing power of .1 (meaning it will use, on the average, a tenth of the total CPU time for the entire run).
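The arithmetic of the allocation can be checked with a short Python sketch. It follows the formula as stated in the text (a sum of squared terms, squared again, with a floor of 1) and a plain proportional division; the function names square_both_metric and support_in_ratios_shares are hypothetical stand-ins and do not reproduce the Ether primitives.

```python
def square_both_metric(domains):
    """((10 - n_1)^2 + ... + (10 - n_k)^2)^2, floored at 1.

    `domains` maps each letter to the collection of digits it may still be.
    """
    inner = sum((10 - len(digits)) ** 2 for digits in domains.values())
    return max(inner ** 2, 1)


def support_in_ratios_shares(parent_power, factors):
    """Divide parent_power among children in proportion to `factors`."""
    total = sum(factors)
    if total == 0:
        return [0.0 for _ in factors]
    return [parent_power * factor / total for factor in factors]
```

A viewpoint in which each of ten letters has two candidate digits scores (10 * 8^2)^2 = 409600, whereas a completely unconstrained viewpoint scores the floor value 1, so nearly all of the processing power not reserved for the manager activity would go to the former.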

This scheme gives considerably better performance than the simple parallel solution. It does better than the backtracking solution on some examples with a single processor implementation, although on the average the backtracking solution is more efficient. It is important to understand the source of this improvement. We have a scheme for estimating the likelihood that a running activity will return useful information in a short period of time. We allocate more resources to those activities that we estimate will supply us with information for the least amount of resource expenditure. Assuming our heuristic is reasonable, the average time to complete the search is reduced.

There are three more improvements we have made to the processing power allocation strategy before reaching the final strategy for which we have collected data in the next section. Each will be described in turn.


6.2 Concurrency Factors


We have observed in the allocation strategy discussed thus far that even though activities are running with different amounts of processing power that are related to our estimate of the utility of getting useful information back from them, there still seem to be so many activities running that they tend to thrash against one another. We would like to limit the amount of concurrency so that the running activities can get something done. For this purpose we introduce the notion of a concurrency factor. Instead of letting all runnable activities run, we pick the n most promising activities (using the metric above), where n is the concurrency factor, and give only those activities processing power, in the ratios defined by the metric. The optimal value for the concurrency factor is picked experimentally and is discussed below.
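The selection step can be sketched in a few lines of Python. The function name allocate_with_concurrency_factor and the dictionary of scores are hypothetical; the scores stand for values of the evaluation metric, and activities outside the best n simply receive no processing power.

```python
import heapq


def allocate_with_concurrency_factor(scores, n, total_power=1.0):
    """Give power only to the n best-scoring activities, in ratio of their scores.

    `scores` maps an activity name to its (non-negative) metric value.
    """
    chosen = set(heapq.nlargest(n, scores, key=scores.get))
    total = sum(scores[name] for name in chosen)
    shares = {}
    for name, score in scores.items():
        if name in chosen and total > 0:
            shares[name] = total_power * score / total
        else:
            shares[name] = 0.0
    return shares
```

With scores of 6, 3, and 1 and a concurrency factor of 2, the first two activities receive two thirds and one third of the power and the third receives none; a concurrency factor of 1 reduces to the depth-first case in which a single activity runs.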

The value of the concurrency factor that yields the best result is a reflection of two aspects of the problem: the quality of our heuristic knowledge and the distribution of computational expense for picking bad branches in the search. Obviously if our heuristic knowledge were perfect, i.e. it could always point to the correct branch to explore next, the optimal concurrency factor would be 1 -- it should simply explore this best branch. If we are less sure about which is the best, more branches should be explored. Also, if the computational cost of exploring a bad branch is always small, a small concurrency factor would be appropriate. If, however, the cost of a bad branch can be very large we would want to use a larger concurrency factor. With a small concurrency factor we increase the probability that the problem solver will become stuck for a very long time. A limiting case of this is with a search space that is infinite (introducing the possibility of a bad branch that never runs out of possibilities) and a concurrency factor of 1. If the problem solver happens to pick one of these branches it will diverge.

Hayes-Roth has noted an analogy with portfolio theory, the purpose of which is to pick an investment strategy that will yield the greatest expected capital appreciation. Uncertainty about the future performance of certain industries and volatility in the market place argue for greater diversification of the portfolio.

6.3 Estimating Which Assumptions Are Most Valuable

Our strategy so far has been to use hypothesize-and-test on one letter only in each viewpoint. We sprout one
new viewpoint and activity to test the hypothesis that that letter is each one of the digits it could possibly be in
the parent viewpoint. This is not necessarily the best strategy. By hypothesizing a letter is a certain digit we
may learn a lot or a little. We have "learned a lot" if we (1) discover quickly that a viewpoint is contradictory,
or (2) cause a lot of constraint propagation activity that significantly increases our evaluation of the new
viewpoint. One thing we have observed is that the amount we learn from assuming a letter is a particular digit
does not significantly depend on which digit we use. In other words, if we assume the letter N is 2 and discover
a contradiction, then we are likely to either discover a contradiction or significantly constrain our solution by
assuming N is any other digit on its list of alternatives. To take advantage of this phenomenon the program
remembers what happened when it makes particular assumptions. When it creates a new viewpoint to study
the result of assuming a letter is a particular digit, the result is recorded in the parent viewpoint when it has
completed. There are two possible results. If it led to a contradiction this fact is recorded. If it led to a
quiescent (but consistent) state it records the difference of the evaluation metric applied to the parent
viewpoint and the evaluation metric on the quiescent viewpoint -- our estimate of the amount of reduction
that is likely to be obtained by assuming this letter to be a digit. Our new evaluation metric attempts to take
this information into consideration. When assuming a letter L is a specific digit we use the old evaluation
metric if we have never assumed L to be a particular digit from this viewpoint; otherwise, we use
the average of the evaluations for each of the resultant viewpoints. We then multiply this figure by the factor
1 + .5 * n where n is the number of digits that we have assumed L to be and determined that they lead to
contradictions.
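
The revised metric can be sketched as follows. This is an illustration under stated assumptions rather than
the program's code: the `LetterHistory` record and the function name are hypothetical, and the text does not
say what base figure to use when every earlier hypothesis about the letter ended in a contradiction, so the
sketch falls back to the old metric in that case.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LetterHistory:
    """What a parent viewpoint remembers about hypotheses made on one letter."""
    # Evaluations recorded for child viewpoints that reached a quiescent,
    # consistent state.
    evaluations: List[float] = field(default_factory=list)
    # Number of digits that were tried for this letter and led to a contradiction.
    contradictions: int = 0


def adjusted_evaluation(old_metric: float, history: LetterHistory) -> float:
    """Return the old metric, or the average recorded evaluation, scaled by 1 + .5 * n."""
    if history.evaluations:
        base = sum(history.evaluations) / len(history.evaluations)
    else:
        # Assumption: with no quiescent results on record, keep the old metric.
        base = old_metric
    return base * (1 + 0.5 * history.contradictions)
```

For example, a letter with recorded evaluations 4 and 8 and two contradictory digits is rated
6 * (1 + .5 * 2) = 12, whatever its old metric was.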

Now that we have a mechanism for taking advantage of information learned by making different assumptions
we would like to ensure that a variety of choices are tried at each branching point. We will slightly modify the
technique for picking the activities to be run at any given time (in accordance with the concurrency factor).
Where c is the concurrency factor, we use the following algorithm to pick the c activities to run at a given
time:

1. The activity with the highest evaluation is scheduled.

2. If n < c activities have been selected for running, the (n+1)st activity is (a) the one with the highest metric if
it does not duplicate any of the first n activities in terms of which letter it is making an assumption about for a
given viewpoint, or (b) the highest rated non-duplicated activity unless the highest rated activity has a rating
at least three times higher, in which case we use the highest rated activity. The factor three was picked
experimentally and is based on the following argument. There is a certain advantage in having a diversity of
letters being tested because this gives us a greater chance to discover assumptions that will cause significant
shrinkage by constraint propagation. However, there is also an advantage to running the activity that we have
estimated will give us the best result. The factor three is the ratio of estimates for expected gain for which we
would rather run the higher estimated test than one that will increase our diversity.
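
The two-step selection rule above can be sketched as follows. The `Activity` record, its field names, and
`pick_activities` are hypothetical names introduced for illustration; only the rule itself and the factor
three come from the text.

```python
from typing import List, NamedTuple


class Activity(NamedTuple):
    viewpoint: str  # the parent viewpoint the hypothesis is made from
    letter: str     # the letter being hypothesized about
    digit: int      # the digit assumed for that letter
    rating: float   # the activity's evaluation


DIVERSITY_RATIO = 3.0  # the experimentally chosen factor three


def pick_activities(candidates: List[Activity], c: int) -> List[Activity]:
    """Pick up to c activities, preferring a diversity of (viewpoint, letter) pairs."""
    remaining = sorted(candidates, key=lambda a: a.rating, reverse=True)
    chosen: List[Activity] = []
    while remaining and len(chosen) < c:
        used = {(a.viewpoint, a.letter) for a in chosen}
        best = remaining[0]
        # Highest rated activity that does not duplicate an already chosen pair.
        fresh = next((a for a in remaining if (a.viewpoint, a.letter) not in used), None)
        if fresh is None or fresh is best or best.rating >= DIVERSITY_RATIO * fresh.rating:
            pick = best   # step 1, case (a), or the three-times exception in (b)
        else:
            pick = fresh  # case (b): favor diversity
        chosen.append(pick)
        remaining.remove(pick)
    return chosen
```

With ratings N=2 (9), N=3 (8), E=5 (4), and R=7 (2) in one viewpoint and c = 3, the sketch picks N=2, then
E=5 (8 is less than three times 4), then N=3 (8 is at least three times 2).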

7. An Experiment


In order to test for the existence of a speed-up with concurrency we timed 10 problems using the final parallel
algorithm described above for several concurrency factors. The problems tested are:

1)  DONALD  +  GERALD  =  ROBERT 

2)  CRIME  +  TRIAL  =  THIEF 

3)  POTATO  +  TOMATO  =  VEGIES 

4)  MIGHT  +  RIGHT  =  MONEY 

5)  FUNNY  +  CLOWN  =  SHOWS 

6)  FEVER  +  CHILL  =  SLEEP 

7)  SHOVEL  +  TROWEL  =  WORKER

8)  TRAVEL  +  NATIVE  =  SAVAGE

9)  RIVER  +  WATER  =  SHIPS

10)  LONGER  +  LARGER  =  MIDDLE

They were picked by a trial-and-error process of selecting possible problems and then running them to see if
they have a solution. It is not known whether they have one or more than one solution. The program finishes
when it has found one solution. These tests were run on the MIT Lisp machine, a single user machine
designed for efficient execution of Lisp programs. The times represent processor run time only and are
adjusted for time lost due to paging. The manager activity, which continually monitors the state of the search
activities and readjusts processing power accordingly, receives a processing power allocation of .1. We tested
with concurrency factors between 1 and 7. Numbers 2 through 7 each gave some improvement with 4 being
the best. Here we report the results for concurrency factors 1 and 4. Times reported are in seconds:


            concurrency    concurrency    ratio
            factor = 1     factor = 4

   1)           377            140         2.69
   2)            86            153          .86
   3)           167            192          .87
   4)            79            246          .32
   5)           663            227         2.92
   6)          2868            348         8.24
   7)           241            112         2.16
   8)            78            335          .23
   9)          1920            564         2.55
  10)           474            212         2.24

  total:       6952           2519         2.76

With a concurrency factor of 1 the algorithm becomes, functionally, a depth-first search. A concurrency
factor of 4 represents the value which yields least average run time for the problems examined. Concurrency
factors larger and smaller yield higher average values. We caution the reader not to take the numbers too
seriously. We only wish to demonstrate that the parallel algorithm runs with some improvement of efficiency
over the sequential algorithm.


Some interesting facts can be learned by examining the data. Although the parallel solution beat out the
sequential solution in only 6 of the 10 cases, these six cases are the ones for which the sequential solutions take
the longest. In particular, problems 6 and 9 show by far the longest times for the sequential solution and
the time saving of the parallel solution is considerable. Similarly, for the cases in which the sequential
solution finished quickly, the parallel solution tended to take longer. This phenomenon is fairly easy to
explain. The parallel solution supplies "insurance" against picking bad branches in the search space. If the
sequential solution happened to pick a bad branch (or several bad branches) there was no recourse but to
follow it through. Similarly, if the sequential program found a relatively quick path to the solution, the extra
efficiency of the parallel solution was not needed.

8. Conclusions

We have demonstrated that cryptarithmetic puzzles can be solved with a certain increase in average efficiency
by the parallel algorithm described over a more traditional depth-first search solution. While this result in
and of itself is of little use it does demonstrate a tool that may be of great use in heuristic programming -- the
use of parallelism to control a heuristic search. Several writers have pointed to the use of meta-level
knowledge (e.g. Davis [2]) in controlling a search. Meta-level knowledge is knowledge about how to use the
problem solving tools at hand in a way that increases overall search efficiency. The allocation strategies we
have examined are meta-level knowledge for cryptarithmetic problems. By allowing a few activities to run in
parallel, and with controllable amounts of processing power, we are able to increase the efficiency of the search.
Although the increase we gained is not dramatic there is reason to suspect that it would be more significant in
more interesting problems. The size of the search space in these problems is relatively quite small. Thus
picking a "bad branch" in the search can't be too catastrophic. With a search space that is much larger, and
possibly infinite (as is the case with many interesting problems), a bad branch using a parallel search can only
do a bounded amount of harm, bounded by the quantity of processing power allocated to it.

We introduced several concepts that were used in the construction of the allocation strategy. Processing
power is allocated in proportion to an estimate of how likely we are to get useful information out of the
exploration of a branch. Concurrency factors have been introduced to keep the problem solver reasonably
focused. A certain amount of diversity is incorporated in the algorithm to increase the likelihood of
discovering assumptions that can be made that will lead to valuable information quickly. Although the only
problem we have examined is cryptarithmetic, there is nothing about these general strategies that is specific to
cryptarithmetic. They contribute to a general theory of parallel problem solving.
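
Proportional allocation of processing power can be sketched as below. The function name and dictionary
layout are hypothetical. The sketch also assumes that the manager's fixed share of .1 is set aside first and
that the remaining power is split among the running activities, with an even split when all estimates are
zero; the text does not spell out either detail.

```python
from typing import Dict

MANAGER_SHARE = 0.1  # fixed allocation given to the manager activity


def allocate_power(estimates: Dict[str, float],
                   manager_share: float = MANAGER_SHARE) -> Dict[str, float]:
    """Split the non-manager processing power in proportion to each estimate."""
    if not estimates:
        return {}
    available = 1.0 - manager_share
    total = sum(estimates.values())
    if total <= 0:
        # Assumption: with no usable estimates, share the power evenly.
        return {name: available / len(estimates) for name in estimates}
    return {name: available * value / total for name, value in estimates.items()}
```

Because every activity's share is a fixed fraction of the total, a bad branch can consume no more than the
power allocated to it, which is the sense in which its harm is bounded.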

The form of the code is quite simple to write and understand. The algorithm consists of a mixture of
constraint propagation and parallel hypothesize-and-test. The programs involve asynchronous, concurrent
activities processing different sets of assumptions. Furthermore, the resources allocated to these activities can
be altered asynchronously with the execution of the activities.

We have demonstrated that introducing concurrency in the search process does actually increase overall
efficiency; in particular it does no harm. This lends support to efforts to design a computing system for
message passing languages that involves many intercommunicating autonomous processors (e.g. Hewitt [4]).
It suggests there is inherent concurrency in search problems that could be gainfully run on multiple
processors.